Smart Paper Notes

3D Room Layout Estimation from a Cubemap of Panorama Image via Deep Manhattan Hough Transform

Yining Zhao, Chao Wen, Zhou Xue, Yue Gao

Category: Computer Vision

2022-07-19

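In estimating 3D room layout from a single panoramic image, significant geometric structures can be compactly described by global wireframes. Based on this observation, we present an alternative approach that estimates the walls in 3D space by modeling long-range geometric patterns in a learnable Hough transform block. We transform image features from a cubemap tile into the Hough space of a Manhattan world and map the features directly to the geometric output. The convolutional layers not only learn local gradient-like line features but also use global information to successfully predict occluded walls with a simple network structure. Unlike most previous work, predictions are performed individually on each cubemap tile and then assembled to obtain the layout estimate. Experimental results show that we achieve results comparable to recent state-of-the-art methods in prediction accuracy and performance. Code is available at https://github.com/starrah/dmh-net.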
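
The Hough-space voting that this abstract builds on can be illustrated with the classical, non-learnable Hough transform for straight lines. This is a minimal sketch and not the paper's Deep Manhattan Hough Transform: the function names `hough_lines` and `strongest_line`, the 1-degree angular resolution, and the toy 100x100 image are assumptions made purely for illustration.

```python
import math

def hough_lines(points, width, height, n_theta=180):
    """Accumulate votes in (theta, rho) space for a list of (x, y) edge pixels."""
    max_rho = int(math.ceil(math.hypot(width, height)))
    # acc[t][rho + max_rho] counts pixels lying on the line
    # x*cos(theta) + y*sin(theta) = rho, with theta = pi * t / n_theta.
    acc = [[0] * (2 * max_rho + 1) for _ in range(n_theta)]
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = int(round(x * math.cos(theta) + y * math.sin(theta)))
            acc[t][rho + max_rho] += 1
    return acc, max_rho

def strongest_line(acc, max_rho):
    """Return (votes, theta in degrees, rho) of the most-voted accumulator bin."""
    n_theta = len(acc)
    votes, t, r = max(
        ((acc[t][r], t, r) for t in range(n_theta) for r in range(len(acc[t]))),
        key=lambda item: item[0],
    )
    return votes, math.degrees(math.pi * t / n_theta), r - max_rho

# Toy input (assumed): a vertical line x = 5 in a 100x100 image.
# theta is the angle of the line's normal, so a vertical line has theta = 0.
points = [(5, y) for y in range(100)]
acc, max_rho = hough_lines(points, 100, 100)
votes, theta_deg, rho = strongest_line(acc, max_rho)
print(votes, theta_deg, rho)
```

Every pixel votes for all lines passing through it, so a long line produces a sharp peak even if parts of it are missing, which is the kind of long-range evidence the abstract refers to.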


Optimization of Image Transmission in a Cooperative Semantic Communication Networks

Wenjing Zhang, Yining Wang, Mingzhe Chen, Tao Luo, Dusit Niyato

Categories: Artificial Intelligence | Computer Vision

2023-01-01

In this paper, a semantic communication framework for image transmission is developed. In the investigated framework, a set of servers cooperatively transmit images to a set of users using semantic communication techniques. To evaluate the performance of the studied semantic communication system, a multimodal metric is proposed to measure the correlation between the extracted semantic information and the original image. To meet each user's requirement on this metric (denoted ISS), each server must jointly determine the semantic information to be transmitted and the resource blocks (RBs) used for semantic information transmission. We formulate this as an optimization problem that aims to minimize each server's transmission latency while meeting the ISS requirement. To solve this problem, a value-decomposition-based, entropy-maximized multi-agent reinforcement learning (RL) algorithm is proposed, which enables servers to coordinate during training and execute RB allocation in a distributed manner, approaching globally optimal performance with fewer training iterations. Compared to traditional multi-agent RL, the proposed RL improves the servers' exploration of valuable actions and the probability of finding a globally optimal RB allocation policy based on local observations. Simulation results show that the proposed algorithm can reduce the transmission delay by up to 16.1% compared to traditional multi-agent RL.


DIAMOND: Taming Sample and Communication Complexities in Decentralized Bilevel Optimization

Peiwen Qiu, Yining Li, Zhuqing Liu, Prashant Khanduri, Jia Liu, Ness B. Shroff, Elizabeth Serena Bentley, Kurt Turck

Category: Machine Learning

2022-12-05

Decentralized bilevel optimization has received increasing attention recently due to its foundational role in many emerging multi-agent learning paradigms (e.g., multi-agent meta-learning and multi-agent reinforcement learning) over peer-to-peer edge networks. However, because edge networks have limited computation and communication capabilities, a major challenge in developing decentralized bilevel optimization techniques is to lower sample and communication complexities. This motivates us to develop a new decentralized bilevel optimization algorithm called DIAMOND (decentralized single-timescale stochastic approximation with momentum and gradient-tracking). The contributions of this paper are as follows: i) our DIAMOND algorithm adopts a single-loop structure rather than following the natural double-loop structure of bilevel optimization, which offers low computation and implementation complexity; ii) compared to existing approaches, the DIAMOND algorithm does not require any full gradient evaluations, which further reduces both sample and computational complexities; iii) through a careful integration of momentum information and gradient tracking techniques, we show that the DIAMOND algorithm achieves $\mathcal{O}(\epsilon^{-3/2})$ sample and communication complexities for reaching an $\epsilon$-stationary solution, both of which are independent of the dataset sizes and significantly outperform existing works. Extensive experiments also verify our theoretical findings.
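
Gradient tracking, one of the two ingredients named in this abstract, can be sketched on a much simpler problem. The example below is not DIAMOND: it is plain deterministic gradient tracking on a single-level problem, with no momentum, no stochastic sampling, and no bilevel structure. The quadratic local objectives, the 4-agent ring, the mixing matrix `W`, and the step size are all assumptions chosen for illustration.

```python
def mix(W, v):
    """One round of neighbor averaging: returns W @ v for a list-of-lists W."""
    n = len(v)
    return [sum(W[i][j] * v[j] for j in range(n)) for i in range(n)]

def gradient_tracking(targets, W, step=0.1, iters=500):
    """Each agent i holds f_i(x) = 0.5 * (x - targets[i])**2 and a scalar x_i.

    x_i is the agent's estimate of the minimizer of sum_i f_i;
    y_i tracks the network-wide average gradient using only neighbor messages.
    """
    n = len(targets)

    def local_grads(x):
        return [x[i] - targets[i] for i in range(n)]

    x = [0.0] * n
    g = local_grads(x)
    y = list(g)  # tracker is initialized with the local gradients
    for _ in range(iters):
        wx = mix(W, x)
        x_new = [wx[i] - step * y[i] for i in range(n)]
        g_new = local_grads(x_new)
        wy = mix(W, y)
        # Tracker update: average with neighbors, then add the gradient change.
        y = [wy[i] + g_new[i] - g[i] for i in range(n)]
        x, g = x_new, g_new
    return x

# Assumed toy network: 4 agents on a ring with a doubly stochastic mixing matrix.
W = [
    [0.50, 0.25, 0.00, 0.25],
    [0.25, 0.50, 0.25, 0.00],
    [0.00, 0.25, 0.50, 0.25],
    [0.25, 0.00, 0.25, 0.50],
]
targets = [1.0, 2.0, 3.0, 6.0]
solution = gradient_tracking(targets, W)
print(solution)
```

All agents should agree on the mean of `targets`, which minimizes the sum of the local objectives, even though each agent only ever exchanges values with its two ring neighbors.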


xTrimoABFold: Improving Antibody Structure Prediction without Multiple Sequence Alignments

Yining Wang, Xumeng Gong, Shaochuan Li, Bing Yang, YiWu Sun, Chuan Shi, Hui Li, Yangang Wang, Cheng Yang, Le Song

Category: Artificial Intelligence

2022-11-30

In the field of antibody engineering, an essential task is to design a novel antibody whose paratopes bind to a specific antigen with correct epitopes. Understanding antibody structure and its paratope can facilitate a mechanistic understanding of its function. Therefore, antibody structure prediction from sequence alone has always been a highly valuable problem for de novo antibody design. AlphaFold2, a breakthrough in the field of structural biology, provides a solution that predicts protein structure based on protein sequences and computationally expensive coevolutionary multiple sequence alignments (MSAs). However, its limited computational efficiency and unsatisfactory prediction accuracy on antibodies, especially on their complementarity-determining regions (CDRs), limit its application in industrial high-throughput drug design. To learn an informative representation of antibodies, we employed a deep antibody language model (ALM) on curated sequences from the observed antibody space database via a transformer model. We also developed a novel model named xTrimoABFold to predict antibody structure from antibody sequence based on the pretrained ALM as well as efficient evoformers and structural modules. The model was trained end-to-end on the antibody structures in PDB by minimizing the ensemble loss of domain-specific focal loss on CDR and the frame-aligned point loss. xTrimoABFold outperforms AlphaFold2 and other protein-language-model-based SOTAs, e.g., OmegaFold, HelixFold-Single, and IgFold, by a large margin (30+% improvement on RMSD) while performing 151 times faster than AlphaFold2. To the best of our knowledge, xTrimoABFold achieved state-of-the-art antibody structure prediction. Its improvement in both accuracy and efficiency makes it a valuable tool for de novo antibody design and could enable further improvements in immuno-theory.


GreenPLM: Cross-lingual pre-trained language models conversion with (almost) no cost

Qingcheng Zeng, Lucas Garay, Peilin Zhou, Dading Chong, Yining Hua, Jiageng Wu, Yikang Pan, Han Zhou, Jie Yang

Category: Natural Language Processing

2022-11-13

While large pre-trained models have transformed the field of natural language processing (NLP), the high training cost and low cross-lingual availability of such models prevent the new advances from being equally shared by users across all languages, especially the less spoken ones. To promote equal opportunities for all language speakers in NLP research and to reduce energy consumption for sustainability, this study proposes GreenPLM, an effective and energy-efficient framework that uses bilingual lexicons to directly translate language models of one language into other languages at (almost) no additional cost. We validate this approach in 18 languages and show that this framework is comparable to, if not better than, other heuristics trained at high cost. In addition, when given a low computational budget (2.5%), the framework outperforms the original monolingual language models in six out of seven tested languages. We release the source code and language models in 50 languages translated from English.
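
The idea of translating a model through a bilingual lexicon can be sketched at the vocabulary level. This is only a toy illustration under stated assumptions, not the GreenPLM implementation: the function name `translate_vocab`, the token-to-row-index vocabulary layout, and the "first token wins" rule for lexicon collisions are all hypothetical choices made here.

```python
def translate_vocab(vocab, lexicon):
    """Relabel a source-language vocabulary with target-language words.

    vocab:   dict mapping source token -> embedding row index
    lexicon: dict mapping source word  -> target word
    The embedding matrix itself is untouched; only the token labels change.
    """
    new_vocab = {}
    for token, row in vocab.items():
        # Tokens missing from the lexicon (e.g. special tokens) are kept as-is.
        target = lexicon.get(token, token)
        # Assumed collision rule: the first source token mapped to a target wins.
        new_vocab.setdefault(target, row)
    return new_vocab

# Toy English -> German example (assumed data).
vocab = {"[CLS]": 0, "dog": 1, "cat": 2, "hound": 3}
lexicon = {"dog": "Hund", "cat": "Katze", "hound": "Hund"}
german_vocab = translate_vocab(vocab, lexicon)
print(german_vocab)
```

Because no weights are retrained in this step, the conversion cost is essentially a dictionary lookup per token, which is the sense in which such a conversion can be "(almost) no cost".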


METS-CoV: A Dataset of Medical Entity and Targeted Sentiment on COVID-19 Related Tweets

Peilin Zhou, Zeqiang Wang, Dading Chong, Zhijiang Guo, Yining Hua, Zichang Su, Zhiyang Teng, Jiageng Wu, Jie Yang

Category: Natural Language Processing

2022-09-28

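The COVID-19 pandemic continues to raise a variety of topics that are discussed or debated on social media. To explore the impact of the pandemic on people's lives, it is crucial to understand the public's concerns and attitudes toward pandemic-related entities (e.g., drugs, vaccines) on social media. However, models trained on existing named entity recognition (NER) or targeted sentiment analysis (TSA) datasets have limited ability to understand COVID-related social media texts, because these datasets were not designed or annotated from a medical perspective. This paper releases METS-CoV, a dataset containing medical entities and targeted sentiments from COVID-related tweets. METS-CoV contains 10,000 tweets with 7 types of entities, including 4 medical entity types (Disease, Drug, Symptom, and Vaccine) and 3 general entity types (Person, Location, and Organization). To further investigate tweet users' attitudes toward specific entities, 4 types of entities (Person, Organization, Drug, and Vaccine) are selected and annotated with user sentiments, resulting in a targeted sentiment dataset with 9,101 entities (in 5,278 tweets). To the best of our knowledge, METS-CoV is the first dataset to collect medical entities and corresponding sentiments from COVID-related tweets. We benchmark classical machine learning models and state-of-the-art deep learning models through extensive experiments. The results show that there is substantial room for improvement on both the NER and TSA tasks. METS-CoV is an important resource for developing better medical social media tools and facilitating computational social science research, especially in epidemiology. Our data, annotation guidelines, benchmark models, and source code are publicly available (https://github.com/ylab-open/mets-cov) to ensure reproducibility.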


Detecting Political Biases of Named Entities and Hashtags on Twitter

Zhiping Xiao, Jeffrey Zhu, Yining Wang, Pei Zhou, Wen Hong Lam, Mason A. Porter, Yizhou Sun

Category: Machine Learning

2022-09-16

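Ideological divisions in the United States have become increasingly prominent in daily communication. Accordingly, there has been much research on political polarization, including many recent efforts that take a computational perspective. By detecting political biases in a corpus of text, one can attempt to describe and discern the polarity of that text. Intuitively, the named entities (i.e., the nouns and the phrases that act as nouns) and hashtags in text often carry information about political views. For example, people who use the term "pro-choice" are likely to be liberal, whereas people who use the term "pro-life" are likely to be conservative. In this paper, we seek to reveal political polarities in social-media text data and to quantify these polarities by assigning a polarity score to entities and hashtags. Although this idea is straightforward, it is difficult to perform such inference in a trustworthy quantitative way. Key challenges include the small number of known labels, the continuous spectrum of political views, and the preservation of both a polarity score and a polarity-neutral semantic meaning in a word-embedding vector. To overcome these challenges, we propose the Polarity-aware Embedding Multi-task learning (PEM) model. This model consists of (1) a self-supervised context-preservation task, (2) an attention-based tweet-level polarity-inference task, and (3) an adversarial learning task that promotes independence between an embedding's polarity dimension and its semantic dimensions. Our experimental results demonstrate that our PEM model can successfully learn polarity-aware embeddings. We examine a variety of applications and thereby demonstrate the effectiveness of our PEM model. We also discuss important limitations of our work and stress caution when applying the PEM model to real-world scenarios.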


Map Container: A Map-based Framework for Cooperative Perception

Kun Jiang, Yining Shi, Benny Wijaya, Mengmeng Yang, Tuopu Wen, Zhongyang Xiao, Diange Yang

Category: Robotics

2022-08-28

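The idea of cooperative perception is to benefit from perception data shared among multiple vehicles and to overcome the limitations of the on-board sensors of a single vehicle. However, the fusion of multi-vehicle information remains challenging because of inaccurate localization, limited communication bandwidth, and ambiguous fusion. Past practice simplifies the problem by installing a precise GNSS localization system, manually specifying the number of connected vehicles, and fixing the fusion strategy in advance. This paper proposes a map-based cooperative perception framework, named Map Container, to improve the accuracy and robustness of cooperative perception and ultimately overcome this problem. The concept "Map Container" denotes that the map serves as the platform that transforms all information into the map coordinate space and incorporates different sources of information in a distributed fusion architecture. In the proposed Map Container, the GNSS signal and the matching relationship between sensor features and map features are considered in order to optimize the estimation of environment states. Evaluation results on a simulation dataset and a real-vehicle platform validate the effectiveness of the proposed method.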


Bridging the View Disparity of Radar and Camera Features for Multi-modal Fusion 3D Object Detection

Taohua Zhou, Yining Shi, Junjie Chen, Kun Jiang, Mengmeng Yang, Diange Yang

Category: Computer Vision

2022-08-25

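Environmental perception with multi-modal fusion of radar and camera is crucial in autonomous driving for increasing accuracy, completeness, and robustness. This paper focuses on how to use millimeter-wave (MMW) radar and camera sensor fusion for 3D object detection. A novel method is proposed that realizes feature-level fusion under bird's-eye view (BEV) for a better feature representation. First, radar features are augmented with temporal accumulation and sent to a temporal-spatial encoder for radar feature extraction. Meanwhile, multi-scale 2D image features that adapt to various spatial scales are obtained by an image backbone and neck model. Then, the image features are transformed to BEV with the designed view transformer. In addition, this work fuses the multi-modal features with a two-stage fusion model, whose stages are called point fusion and ROI fusion. Finally, a detection head regresses object categories and 3D locations. Experimental results demonstrate that the proposed method achieves state-of-the-art performance under the most important detection metrics, mean average precision (mAP) and nuScenes detection score (NDS).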


Performance Optimization for Semantic Communications: An Attention-based Reinforcement Learning Approach

Yining Wang, Mingzhe Chen, Tao Luo, Walid Saad, Dusit Niyato, H. Vincent Poor, Shuguang Cui

Category: Artificial Intelligence

2022-08-17

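In this paper, a semantic communication framework is proposed for textual data transmission. In the studied model, a base station (BS) extracts the semantic information from textual data and transmits it to each user. The semantic information is modeled by a knowledge graph (KG) that consists of a set of semantic triples. After receiving the semantic information, each user recovers the original text using a graph-to-text generation model. To measure the performance of the considered semantic communication framework, a metric of semantic similarity (MSS) that jointly captures the semantic accuracy and completeness of the recovered text is proposed. Due to wireless resource limitations, the BS may not be able to transmit the entire semantic information to each user while satisfying the transmission delay constraint. Hence, the BS must select an appropriate resource block for each user as well as determine and transmit part of the semantic information to the users. As such, we formulate an optimization problem whose goal is to maximize the total MSS by jointly optimizing the resource allocation policy and determining the partial semantic information to be transmitted. To solve this problem, a proximal-policy-optimization-based reinforcement learning (RL) algorithm integrated with an attention network is proposed. The proposed algorithm can evaluate the importance of each triple in the semantic information using the attention network and then build a relationship between the importance distribution of the triples in the semantic information and the total MSS. Compared to traditional RL algorithms, the proposed algorithm can dynamically adjust its learning rate, thus ensuring convergence to a locally optimal solution.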
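
The notion of transmitting only part of a knowledge graph can be made concrete with a small sketch. It is a greedy stand-in and not the paper's attention-based reinforcement learning algorithm: the `Triple` type, the hand-written importance scores (which the paper obtains from an attention network), and the budget expressed as a number of triples are assumptions for illustration only.

```python
from typing import List, NamedTuple, Sequence

class Triple(NamedTuple):
    """One semantic triple of a knowledge graph: (subject, relation, object)."""
    subject: str
    relation: str
    obj: str

def select_triples(triples: Sequence[Triple],
                   importance: Sequence[float],
                   budget: int) -> List[Triple]:
    """Keep the `budget` most important triples, preserving their original order."""
    ranked = sorted(range(len(triples)), key=lambda i: importance[i], reverse=True)
    keep = sorted(ranked[:budget])
    return [triples[i] for i in keep]

# Assumed toy knowledge graph extracted from one sentence.
kg = [
    Triple("base station", "transmits", "semantic information"),
    Triple("user", "recovers", "original text"),
    Triple("text", "written in", "English"),
]
scores = [0.2, 0.7, 0.1]  # hypothetical importance scores
partial = select_triples(kg, scores, budget=2)
print(partial)
```

With a budget of two triples, the least important triple is dropped and the receiver would reconstruct the text from the remaining two.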
